Diffusion Models for Constrained Domains
Denoising diffusion models are a recent class of generative models which
achieve state-of-the-art results in many domains such as unconditional image
generation and text-to-speech tasks. They consist of a noising process
destroying the data and a backward stage defined as the time-reversal of the
noising diffusion. Building on their success, diffusion models have recently
been extended to the Riemannian manifold setting. Yet, these Riemannian
diffusion models require geodesics to be defined for all times. While this
setting encompasses many important applications, it does not include manifolds
defined via a set of inequality constraints, which are ubiquitous in many
scientific domains such as robotics and protein design. In this work, we
introduce two methods to bridge this gap. First, we design a noising process
based on the logarithmic barrier metric induced by the inequality constraints.
Second, we introduce a noising process based on the reflected Brownian motion.
As existing diffusion model techniques cannot be applied in this setting, we
derive new tools to define such models in our framework. We empirically
demonstrate the applicability of our methods to a number of synthetic and
real-world tasks, including the constrained conformational modelling of protein
backbones and robotic arms.
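
As a rough illustration of the second noising process named in this abstract, the sketch below simulates a discretised reflected Brownian motion on the unit hypercube. The choice of domain, the step size, and the function names `reflect_into_unit_box` and `reflected_brownian_noising` are assumptions made for this example only; the paper's own constructions for general inequality constraints are not reproduced here.

```python
import numpy as np


def reflect_into_unit_box(x):
    """Fold points back into [0, 1]^d by mirror reflection at the faces."""
    y = np.mod(x, 2.0)  # the reflection pattern repeats with period 2
    return np.where(y > 1.0, 2.0 - y, y)


def reflected_brownian_noising(x0, n_steps=100, step_size=0.01, rng=None):
    """Euler steps of Brownian motion, each followed by reflection.

    Assumption: the constrained domain is the unit hypercube, which keeps
    the reflection step to a simple folding operation.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x0, dtype=float)
    for _ in range(n_steps):
        # Unconstrained Gaussian increment with variance step_size.
        x = x + np.sqrt(step_size) * rng.standard_normal(x.shape)
        # Reflect any coordinate that left [0, 1] back into the box.
        x = reflect_into_unit_box(x)
    return x


if __name__ == "__main__":
    start = np.full((1000, 2), 0.5)
    noised = reflected_brownian_noising(start, rng=np.random.default_rng(0))
    print(noised.min(), noised.max())  # both values lie within [0, 1]
```

For a box the reflection reduces to folding each coordinate; for a general constraint set the reflection step would need the boundary geometry, which is where the cost of such samplers tends to grow.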
Drug Discovery under Covariate Shift with Domain-Informed Prior Distributions over Functions
Accelerating the discovery of novel and more effective therapeutics is an
important pharmaceutical problem in which deep learning is playing an
increasingly significant role. However, real-world drug discovery tasks are
often characterized by a scarcity of labeled data and significant covariate
shift, a setting that poses a challenge to
standard deep learning methods. In this paper, we present Q-SAVI, a
probabilistic model able to address these challenges by encoding explicit prior
knowledge of the data-generating process into a prior distribution over
functions, presenting researchers with a transparent and probabilistically
principled way to encode data-driven modeling preferences. Building on a novel,
gold-standard bioactivity dataset that facilitates a meaningful comparison of
models in an extrapolative regime, we explore different approaches to induce
data shift and construct a challenging evaluation setup. We then demonstrate
that using Q-SAVI to integrate contextualized prior knowledge of drug-like
chemical space into the modeling process affords substantial gains in
predictive accuracy and calibration, outperforming a broad range of
state-of-the-art self-supervised pre-training and domain adaptation techniques.
Comment: Published in the Proceedings of the 40th International Conference on
Machine Learning (ICML 2023).
Metropolis Sampling for Constrained Diffusion Models
Denoising diffusion models have recently emerged as the predominant paradigm
for generative modelling. Their extension to Riemannian manifolds has
facilitated their application to an array of problems in the natural sciences.
Yet, in many practical settings, such manifolds are defined by a set of
constraints and are not covered by the existing (Riemannian) diffusion model
methodology. Recent work has attempted to address this issue by employing novel
noising processes based on logarithmic barrier methods or reflected Brownian
motions. However, the associated samplers are computationally burdensome as the
complexity of the constraints increases. In this paper, we introduce an
alternative simple noising scheme based on Metropolis sampling that affords
substantial gains in computational efficiency and empirical performance
compared to the earlier samplers. Of independent interest, we prove that this
new process corresponds to a valid discretisation of the reflected Brownian
motion. We demonstrate the scalability and flexibility of our approach on a
range of problem settings with convex and non-convex constraints, including
applications from geospatial modelling, robotics and protein design.
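
A minimal sketch of a Metropolis-style noising step of the kind this abstract describes, assuming the scheme proposes a Gaussian increment and keeps the current point whenever the proposal violates the constraints. The annulus constraint, the step size, and the names `in_constraint_set` and `metropolis_noising` are hypothetical choices for illustration and are not taken from the paper.

```python
import numpy as np


def in_constraint_set(x):
    """Hypothetical non-convex constraint: the annulus 0.5 <= ||x|| <= 1."""
    r2 = np.sum(x**2, axis=-1)
    return (r2 >= 0.25) & (r2 <= 1.0)


def metropolis_noising(x0, n_steps=100, step_size=0.01, rng=None):
    """Propose Gaussian steps; reject any proposal outside the constraint set.

    Assumption: rejected samples simply stay where they are for that step.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.array(x0, dtype=float)
    for _ in range(n_steps):
        proposal = x + np.sqrt(step_size) * rng.standard_normal(x.shape)
        accept = in_constraint_set(proposal)  # one boolean per sample
        # Accepted samples move to the proposal; the rest are left unchanged.
        x = np.where(accept[..., None], proposal, x)
    return x


if __name__ == "__main__":
    start = np.tile(np.array([0.75, 0.0]), (1000, 1))
    noised = metropolis_noising(start, rng=np.random.default_rng(0))
    print(in_constraint_set(noised).all())
```

The only constraint-specific piece is the membership test, so the same loop applies unchanged to convex and non-convex sets, with no reflection or barrier computation needed.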